Database Structures, Based on Tries, for Text
Authors
Abstract
Digital trees, or tries, were introduced thirty years ago for sublinear-time retrieval of substrings from large texts. A well-known example of their use is the University of Waterloo project to put the New Oxford English Dictionary onto CD-ROM. We have recently improved the performance of trie techniques for text and shown how they can be used to search for approximations to a given string. We have also shown that tries have excellent retrieval properties for spatial data. In particular, tries can represent spatial data without redundancy so that it can be displayed at any resolution, retrieving from disk or from the network only the amount of data that will finally be displayed. We have done this mainly for two-dimensional vector data, such as the data making up very large maps, but have also established that the trie techniques apply to raster data and to data in other than two dimensions. These results are the basis for a claim that tries offer the best storage representations for large-scale multimedia databases. We are presently pursuing this claim by developing trie techniques for general data and queries. We give some results for multikey data.
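As a rough illustration of the substring retrieval mentioned in the abstract, the following Python sketch builds a trie over every suffix of a text and looks up the start offsets of a pattern. It is not the authors' data structure: the names `TrieNode`, `build_suffix_trie`, and `find_substring` are hypothetical, and this naive version uses space quadratic in the text length, which a practical index would avoid by compressing the trie.

```python
class TrieNode:
    """One trie node (hypothetical helper for this sketch)."""

    def __init__(self):
        self.children = {}   # character -> TrieNode
        self.positions = []  # start offsets of suffixes that pass through here


def build_suffix_trie(text):
    """Insert every suffix of `text` into a trie (naive, quadratic space)."""
    root = TrieNode()
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.children.setdefault(ch, TrieNode())
            node.positions.append(start)
    return root


def find_substring(root, pattern):
    """Return the start offsets of `pattern` in increasing order.

    The walk takes len(pattern) steps, regardless of the text length.
    An empty pattern returns [] in this sketch.
    """
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []
    return list(node.positions)


root = build_suffix_trie("banana")
print(find_substring(root, "ana"))  # [1, 3]
```

The point of the sketch is that, once the trie is built, the lookup cost is governed by the pattern length rather than by the size of the text.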
Similar resources
Image retrieval using the combination of text-based and content-based algorithms
Image retrieval is an important research field which has received great attention in the last decades. In this paper, we present an approach for the image retrieval based on the combination of text-based and content-based features. For text-based features, keywords and for content-based features, color and texture features have been used. Query in this system contains some keywords and an input...
The Impact of Contextual Clue Selection on Inference
Linguistic information can be conveyed in the form of speech and written text, but it is the content of the message that is ultimately essential for higher-level processes in language comprehension, such as making inferences and associations between text information and knowledge about the world. Linguistically, inference is the shovel that allows receivers to dig meaning out from the text with...
Faster Searching in Tries and Quadtrees - An Analysis of Level Compression
We analyze the behavior of the level-compressed trie, LC-trie, a compact version of the standard trie data structure. Based on this analysis, we argue that level compression improves the performance of both tries and quadtrees considerably in many practical situations. In particular, we show that LC-tries can be of great use for string searching in compressed text. Both tries and quadtrees are ...
Syntactic Complexity of Russian Unified State Exam Texts in English: A Study on Reliability and Validity
In this study we analyze texts used in Russian Unified State Exam on English language. Texts that formed small research corpora were retrieved from 2 resources: official USE database as a reference point, and popular website used by pupils for USE training “Neznaika” (https://neznaika.pro/). The size of two corpora is balanced: USE has 11934 tokens and “Neznaika” - 11918 tokens. We share Biber’...
Survey on Information Retrieval and Pattern Matching for Compressed Data Size using the SVD Technique on Real Audio Dataset
Due to increasing size of text and audio data over internet, various techniques are needed to help with the finding and extraction of very specific information relevant to a user's task. Text mining is a variant on a field called data mining that tries to discover curious patterns from large databases. Singular value decomposition this technique is used for dimensionality reduction of large dat...
Journal title:
Volume, issue:
Pages: -
Publication date: 1996